IMPORT DATA opens a file and attempts to import data from it. If the information in the file is formatted as explained below, ViSta will load the data and display a data icon on the workmap.

For a file to be an importable datafile, it must be a multiple-line file in which each line has the same number of pieces of information, called elements. If there are N variables, every line must have N+1 elements. An element can be a number, a symbol, or a string. A symbol is an unquoted group of keyboard characters containing no white space, while a string is a double-quoted group of characters which may contain white space. Symbols are converted to upper case; strings retain their original case. Missing data elements are represented by the symbol nil or NIL, but NOT by the string "nil" (regardless of case).
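
The element rules above can be sketched in Python. This is only an illustration, not ViSta's own reader: the function name parse_elements, the (kind, value) output and the number syntax it accepts are assumptions made for the example.

```python
import re

# A token is either a double-quoted string or a run of non-whitespace characters.
_TOKEN = re.compile(r'"([^"]*)"|(\S+)')
# Plain decimal numbers such as 7, -3.5, .25 or 1e-3 (an assumed number syntax).
_NUMBER = re.compile(r'[+-]?(\d+\.?\d*|\.\d+)([eE][+-]?\d+)?')


def parse_elements(line):
    """Split one line into (kind, value) pairs (hypothetical helper)."""
    elements = []
    for match in _TOKEN.finditer(line):
        quoted, bare = match.groups()
        if quoted is not None:
            # Strings keep their original case, so "nil" stays a string.
            elements.append(("string", quoted))
        elif _NUMBER.fullmatch(bare):
            elements.append(("number", float(bare)))
        elif bare.upper() == "NIL":
            # The symbol nil (in any case) marks a missing element.
            elements.append(("missing", None))
        else:
            # Symbols are converted to upper case.
            elements.append(("symbol", bare.upper()))
    return elements


print(parse_elements('"Ford Escort" 33.5 nil compact "nil"'))
```

In this sketch the quoted "nil" comes back as an ordinary string, while the unquoted nil is reported as missing.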


TYPES OF LINES IN THE FILE

There are three types of lines which may appear in an importable datafile: the NAMES, TYPES and DATA lines. While there is some flexibility (see below), it is recommended that the DATA lines be preceded by a NAMES line and a TYPES line, each of which may contain only character strings. Files without these lines do not follow the recommended structure and may take longer to process, or may not be importable.

  NAMES: The NAMES line, which is first, begins with a character string which specifies the name of the exporter and of the dataset. This string is a two-part string: the first part identifies the exporter and the second part names the dataset. The parts are separated by a colon. Thus, two examples are "ViSta:Cars" and "Excel:Fish". The line then continues with N more character strings which define the variable names.

  TYPES: The TYPES line, which is second, defines the dataset type (first) and the types of each variable (the rest). The NAMES and TYPES lines can only have elements which are character strings. The NAMES can be any string of characters. The DATA TYPE can be one of the following: "category", "univariate", "bivariate", "multivariate", "classification", "frequency", "freqclass", "crosstabs", "general", "missing", "symmetric" or "asymmetric". The VARIABLE TYPES can be either "category", "ordinal", or "numeric". Capitalization is ignored for all data and variable types.

  DATA: Each of the remaining lines of the file forms a row of the data. The first element of each row is a "label" which is used to identify a data observation (in the case of "symmetric" or "asymmetric" data the label identifies a matrix of data). The first element of a row may be a string, symbol or number, but since labels are strings, the first element of a data line is converted to a string. The remaining elements of each line are the data. A data element must be a number if its variable is "numeric", but can be a number, symbol or string if its variable is a "category" variable.
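
As a rough illustration of the NAMES and TYPES rules, the Python sketch below reads the two header lines and checks them against the type lists given above. It is not ViSta code. The function name read_header, the returned dictionary and the sample header are hypothetical, and Python's shlex.split is used as a stand-in for splitting quoted strings (it does not verify that every element was actually quoted).

```python
import shlex

DATA_TYPES = {
    "category", "univariate", "bivariate", "multivariate", "classification",
    "frequency", "freqclass", "crosstabs", "general", "missing",
    "symmetric", "asymmetric",
}
VARIABLE_TYPES = {"category", "ordinal", "numeric"}


def read_header(names_line, types_line):
    """Parse the NAMES and TYPES lines of a datafile (hypothetical helper)."""
    names = shlex.split(names_line)
    # Capitalization is ignored for data and variable types.
    types = [t.lower() for t in shlex.split(types_line)]
    if len(names) < 2 or len(names) != len(types):
        raise ValueError("NAMES and TYPES lines must each have N+1 elements")
    # The first NAMES element is "exporter:dataset".
    exporter, colon, dataset = names[0].partition(":")
    if not colon:
        # A one-part string is taken as the dataset name alone.
        exporter, dataset = None, names[0]
    if types[0] not in DATA_TYPES:
        raise ValueError(f"unknown data type: {types[0]}")
    unknown = [t for t in types[1:] if t not in VARIABLE_TYPES]
    if unknown:
        raise ValueError(f"unknown variable types: {unknown}")
    return {
        "exporter": exporter,
        "dataset": dataset,
        "data_type": types[0],
        "variables": list(zip(names[1:], types[1:])),
    }


header = read_header('"ViSta:Cars" "MPG" "Weight"',
                     '"Multivariate" "Numeric" "NUMERIC"')
print(header)
```

Each DATA line that follows such a header would then be expected to have the same number of elements as the NAMES line: one label plus one value per variable.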


THE MEANING OF NUMERIC ELEMENTS

Elements of numeric variables are numbers, and numbers carry different meanings in different situations. Depending on the data being imported, you may be asked to indicate what the numbers represent. These meanings fall into three kinds:

  QUANTITY: A number can represent the quantity or amount of some characteristic.

  FREQUENCY: A number can also represent the frequency of something, or the count of something. 

  ASSOCIATION: A number can also indicate the degree of association or correlation between two things, or the distance between a pair of things.



TYPES OF IMPORTABLE FILES

  VISTA EXPORT FILES: Files which follow the recommended structure given above are called ViSta Export Files.

  ANONYMOUS EXPORT FILE: These files follow the recommended structure given above, but the first character string in the file is a one-part string. It is assumed that the string names the dataset, and does not identify the exporting program. 

  UNTYPED EXPORT FILE: These files have only a NAMES line preceding the DATA lines. No TYPES line appears. A file with only one initial line whose elements are all character strings is assumed to be an untyped export file. For such a file it is assumed that the first column of values (i.e., the first value on each line) is a LABEL. The data and variable TYPES are determined from the nature of the data (a variable is "numeric" if it only contains numbers, "category" otherwise). The dataset type is determined from the variable types.

  PLAIN EXPORT FILE: These files only have DATA lines. There is no NAMES or TYPES line. If the initial record of the file contains values other than strings (i.e., contains at least one number or symbol) the file is assumed to be a "Plain" export file. That is, it is assumed that the initial name and type records are absent, and that the first record contains data elements for the first row of data. The first element of each line is assumed to be a LABEL. Default values are used for the variable names, and the variable types are determined from the data (a variable is "numeric" if it only contains numbers, "category" otherwise). The dataset type is determined from the variable types. The datafile name is used as the name for the dataset.
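
The four kinds of file can be told apart by counting how many initial lines consist only of strings. The Python sketch below follows that rule and also infers variable types from the data. It is an illustration under stated assumptions, not ViSta's procedure: the function names and return values are hypothetical, the number syntax is assumed, and missing elements (nil) are skipped during type inference, which the text above does not specify.

```python
import re

# A token is a double-quoted string or a run of non-whitespace characters.
_TOKEN = re.compile(r'"[^"]*"|\S+')
# Assumed number syntax: 7, -3.5, .25, 1e-3 and similar.
_NUMBER = re.compile(r'[+-]?(\d+\.?\d*|\.\d+)([eE][+-]?\d+)?')


def tokens(line):
    return _TOKEN.findall(line)


def all_strings(line):
    parts = tokens(line)
    return bool(parts) and all(p.startswith('"') for p in parts)


def classify(lines):
    """Name the kind of export file from its first two lines."""
    if not all_strings(lines[0]):
        return "plain"        # first record already holds data
    if len(lines) < 2 or not all_strings(lines[1]):
        return "untyped"      # NAMES line only
    first = tokens(lines[0])[0].strip('"')
    # A two-part "exporter:dataset" string marks a ViSta export file.
    return "vista" if ":" in first else "anonymous"


def infer_variable_types(data_lines):
    """Return "numeric" or "category" for each variable column."""
    rows = [tokens(line)[1:] for line in data_lines]  # drop the label
    types = []
    for column in zip(*rows):
        # Assumption: the symbol nil is ignored when inferring a type.
        present = [v for v in column if v.upper() != "NIL"]
        numeric = all(_NUMBER.fullmatch(v) for v in present)
        types.append("numeric" if numeric else "category")
    return types


plain = ['Escort 33.5 compact', 'Civic nil "sub compact"']
print(classify(plain), infer_variable_types(plain))
```

One consequence of this rule is that an untyped file whose first data line happens to contain only strings would be read as having a TYPES line, so files with all-string data rows are safer with an explicit TYPES line.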


